Abstract

Generalized empirical likelihood and the generalized method of moments are widely used methods for solving inverse problems in econometrics. Each method defines a specific semiparametric model for which it is possible to calculate efficiency bounds. Using this approach, we provide a new proof of Chamberlain's result on optimal GMM. We also discuss conditions under which GMM estimators remain efficient with approximate moment constraints.

1 Introduction

We tackle the problem of recovering an unknown probability measure $\mu$ based on a sample $X_1, \dots, X_n$ of i.i.d. realizations with distribution $\mu$, where additional information on $\mu$ is available in the form of a set of moment equations

$$\int \Phi(x)\, d\mu(x) = 0, \qquad (1)$$

for some vector-valued function $\Phi$. This kind of inverse problem has many practical applications in econometrics, notably when dealing with instrumental variables, see for instance Donald et al. (2009). In some cases, the function $\Phi$ is not known exactly but is assumed to belong to some parametric family $\{\Phi(\theta, .),\ \theta \in \Theta \subset \mathbb{R}^d\}$. We are then interested in the estimation of the true value $\theta_0$ of the parameter, which is the zero of $\theta \mapsto \int \Phi(\theta, .)\, d\mu$. The problem of estimating $\theta_0$ in this context has been widely studied in the literature. Two main methods of estimation have been implemented, namely the generalized method of moments (GMM), introduced in Hansen (1982), and the generalized empirical likelihood (GEL), developed in Qin and Lawless (1994) for this particular context.

Although these two methods aim to estimate the same quantity, we point out that they rely on different descriptions of the statistical model. Hence, each method is related to a specific semiparametric model, for which we can calculate the efficiency bound for estimating $\theta_0$, following van der Vaart (1998). By this approach, we exhibit necessary conditions for efficiency of GMM and we recover some known results of Hansen (1982) and Chamberlain (1987) on optimal GMM.

In many practical situations, the function $\Phi$ may have a complicated form that can only be evaluated numerically. Simulation-based methods have been implemented to deal with approximate constraints, see for instance McFadden (1989) and Carrasco and Florens (2000). In this paper we extend the GMM framework to situations where only an approximation $\Phi_m$ of the true constraint function $\Phi$ is available. We provide conditions under which GMM procedures remain asymptotically efficient when $\Phi$ is replaced by its approximation.

The article is organized as follows. After presenting the model in Section 2, we briefly survey the main estimation methods in this model and provide a new proof of the semiparametric efficiency of GMM in Section 3. In Section 4, we discuss the asymptotic efficiency of the method when dealing with an approximate constraint. Proofs are postponed to the Appendix.



2 The model

Let $\mathcal{X}$ be an open subset of $\mathbb{R}^q$, endowed with its Borel field $\mathcal{B}(\mathcal{X})$. We observe an i.i.d. sample $X_1, \dots, X_n$ with unknown distribution $\mu$. We are interested in the estimation of a parameter $\theta_0 \in \Theta \subset \mathbb{R}^d$ defined by the moment condition

$$F(\theta_0, \mu) := \int \Phi(\theta_0, x)\, d\mu(x) = 0, \qquad (2)$$

where $\Phi : \Theta \times \mathcal{X} \to \mathbb{R}^k$ ($k \geq d$) is a known map. How efficiently $\theta_0$ can be estimated depends on the amount of information available on $\mu$. Here, the information given by the moment condition (2) is used to determine the set $\mathcal{M}$ of possible values for $\mu$ (the model). The true value $\theta_0$ of the parameter being unknown, the distribution of the observations can be any probability measure $\nu$ for which the map $\theta \mapsto F(\theta, \nu)$ is null for some value of $\theta = \theta(\nu) \in \Theta$. The model is therefore defined as

$$\mathcal{M} = \{\nu \in \mathcal{P}(\mathcal{X}) : \exists\, \theta = \theta(\nu) \in \Theta : F(\theta, \nu) = 0\},$$

where $\mathcal{P}(\mathcal{X})$ is the set of probability measures on $\mathcal{X}$ and $\theta(\nu)$ is the parameter of interest. In this setting, we aim at calculating the efficiency bound for estimating $\theta_0$, following van der Vaart (1998). We make the following assumptions ($\|.\|$ denotes any norm on a Euclidean space).

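• Assumption 1: $\Theta$ is a compact subset of $\mathbb{R}^d$.

• Assumption 2: The map $F(., \mu)$ is continuous on $\Theta$ and has a unique zero $\theta_0$. Moreover, $\theta_0$ lies in the interior of $\Theta$.

• Assumption 3: For all $x \in \mathcal{X}$, the map $\theta \mapsto \Phi(\theta, x)$ is continuous on $\Theta$ and the map $x \mapsto \sup_{\theta \in \Theta} \|\Phi(\theta, x)\|$ is bounded by some function $\kappa$, integrable with respect to $\mu$.

• Assumption 4: For all $x \in \mathcal{X}$, $\theta \mapsto \Phi(\theta, x)$ is twice continuously differentiable in a neighborhood $\mathcal{N}$ of $\theta_0$. Moreover, $\|\partial \Phi(\theta, x)/\partial \theta\|$ and $\|\partial^2 \Phi(\theta, x)/\partial \theta \partial \theta^t\|$ are continuous and bounded by an integrable function in this neighborhood ($\partial \Phi(\theta, .)/\partial \theta$ will be denoted $\nabla \Phi(\theta, .)$ in the sequel).

• Assumption 5: The matrices $D := \int \nabla \Phi(\theta_0, x)\, d\mu(x) \in \mathbb{R}^{d \times k}$ and $V := \int \Phi(\theta_0, x) \Phi^t(\theta_0, x)\, d\mu(x) \in \mathbb{R}^{k \times k}$ are of full rank.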

These assumptions are standard for this problem, see for instance Qin and Lawless (1994). They ensure the uniqueness of the parameter $\theta(\nu)$ (which, we recall, is defined as the zero of $F(., \nu)$) when $\nu$ is close enough to $\mu$ in the total variation topology, and thus allow a proper definition of the parameter of interest in a neighborhood of $\mu$.

We can now calculate the efficiency bound for estimating $\theta_0$ in this model. For this, we need the following definitions.

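Definition A model $\{\mu_t, t \geq 0\}$ with $\mu_0 = \mu$ is differentiable in quadratic mean at $\mu$ if there exists $g : \mathcal{X} \to \mathbb{R}$ such that $\int_{\mathcal{X}} g^2\, d\mu < \infty$ and

$$\lim_{t \to 0} \int_{\mathcal{X}} \left[ \frac{1}{t}\left( \sqrt{\frac{d\mu_t}{d\tau_t}} - \sqrt{\frac{d\mu}{d\tau_t}} \right) - \frac{1}{2}\, g \sqrt{\frac{d\mu}{d\tau_t}} \right]^2 d\tau_t = 0,$$

setting for all $t \geq 0$, $\tau_t = \mu_t + \mu$.

The function $g$ is called the score of $\{\mu_t, t \geq 0\}$; it satisfies $\int g\, d\mu = 0$. In the next definition, for every function $T_n : \mathcal{X}^n \to \Theta$ of the observations, we denote by $\mathcal{L}(T_n \mid \nu)$ the law of $T_n(X_1, \dots, X_n)$ assuming that $X_1, \dots, X_n$ are independent with distribution $\nu$.

Definition An estimator $\hat\theta = \hat\theta(X_1, \dots, X_n)$ of a parameter $\theta : \mathcal{M} \to \Theta$ is locally Gaussian regular if for every differentiable submodel $\{\mu_t, t \geq 0\} \subset \mathcal{M}$ with $\mu_0 = \mu$ and for every positive sequence $(t_n)_{n \in \mathbb{N}}$ such that $\sqrt{n}\, t_n$ is bounded, $\mathcal{L}(\sqrt{n}(\hat\theta - \theta(\mu_{t_n})) \mid \mu_{t_n})$ converges weakly towards a Gaussian distribution as $n \to \infty$.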

In a given model, the efficiency bound for estimating a parameter $\theta_0$ is to be understood as a lower bound for the asymptotic variance of locally Gaussian regular estimators of $\theta_0$. An efficiency bound is calculated by considering the Fisher information of differentiable submodels. We refer to Bickel et al. (1994) and van der Vaart (1998) for further details.

Theorem 2.1 (Theorem 3, Qin and Lawless (1994)) Suppose that Assumptions 1 to 5 hold. The efficiency bound in this model for estimating $\theta_0$ is

$$B = [D V^{-1} D^t]^{-1}.$$
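As a small numerical illustration of the bound, the sketch below evaluates $B$ in a toy setting that is not taken from the paper: we assume $X \sim N(\theta_0, \theta_0^2 + 1)$ and the hypothetical moment function $\Phi(\theta, x) = (x - \theta,\ x^2 - 2\theta^2 - 1)^t$, for which $D$ and $V$ are available in closed form from the Gaussian moments.

```python
import numpy as np

# Hypothetical toy model (an assumption, not from the paper):
# Phi(theta, x) = (x - theta, x^2 - 2*theta^2 - 1), X ~ N(theta0, theta0^2 + 1).
theta0 = 1.0
sigma2 = theta0**2 + 1.0  # variance of X under the assumed model

# D = E[dPhi/dtheta] at theta0, a d x k = 1 x 2 matrix.
D = np.array([[-1.0, -4.0 * theta0]])

# V = E[Phi Phi^t] at theta0, i.e. the covariance matrix of (X, X^2)
# for a Gaussian X with mean theta0 and variance sigma2.
V = np.array([
    [sigma2, 2.0 * theta0 * sigma2],
    [2.0 * theta0 * sigma2, 4.0 * theta0**2 * sigma2 + 2.0 * sigma2**2],
])

# Efficiency bound B = [D V^{-1} D^t]^{-1}.
B = np.linalg.inv(D @ np.linalg.solve(V, D.T))

print("efficiency bound B:", B[0, 0])
print("asymptotic variance of the sample mean:", sigma2)
```

In this toy model the second moment equation carries information on $\theta_0$, so the bound falls below the asymptotic variance of the plain sample mean.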

Once we have calculated the efficiency bound in our model, the objective is to build an estimator $\hat\theta$ of $\theta_0$ for which the efficiency bound is reached, at least asymptotically, in the sense that

$$\lim_{n \to \infty} n\, \mathrm{var}(\hat\theta) = B.$$

In some cases, there may not exist any locally Gaussian regular estimator achieving the bound, see for instance the examples in Ritov and Bickel (1990). There may also exist estimators having an asymptotic variance smaller than the efficiency bound, in which case the required regularity conditions are not satisfied, as seen in Chapter 2 of Bickel et al. (1994). Such situations will not occur here, as we assume regularity conditions on the model under which GMM and GEL procedures yield regular estimates.

3 Estimation of the parameter 

For this problem, we may adopt two natural, although seemingly different, procedures to estimate $\theta_0$, following Chapter 3 in Bickel et al. (1994). Let $\mu_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}$ denote the empirical distribution, where $\delta$ stands for the Dirac measure.

• Procedure 1: Find a "smooth" extension of $\theta$ to a larger set $\mathcal{P} \supset \mathcal{M}$ of probability measures containing the empirical distribution $\mu_n$, and define the estimator as $\hat\theta = \theta(\mu_n)$.

• Procedure 2: Build an approximation $\hat\mu$ of $\mu$ lying in the model $\mathcal{M}$ and define the estimator as $\hat\theta = \theta(\hat\mu)$.

In the literature, two main methods have been implemented for this problem, each illustrating one of the two procedures.

3.1 Generalized method of moments 

The generalized method of moments (GMM) was introduced in Hansen (1982). The method consists in replacing the true measure $\mu$ in the moment constraint by its empirical approximation $\mu_n$, and then finding the value of $\theta$ for which $F(\theta, \mu_n) = \frac{1}{n} \sum_{i=1}^n \Phi(\theta, X_i)$ is as close as possible to $0$ according to a given Euclidean norm on $\mathbb{R}^k$. Precisely, for $M$ a symmetric positive definite $k \times k$ matrix and $a \in \mathbb{R}^k$, define $\|a\|_M^2 = a^t M a$. The GMM estimator $\hat\theta$ of $\theta_0$ associated with the norm $\|.\|_M$ is given by

$$\hat\theta = \arg\min_{\theta \in \Theta} \|F(\theta, \mu_n)\|_M.$$

In practice, the matrix $M$ may depend on $n$, in which case it is chosen to converge towards a symmetric positive definite matrix. However, replacing the matrix by its limit leads to the same first order asymptotic properties of the estimate, under regularity conditions, as pointed out in Newey and Smith (2004). Here, we will assume for simplicity that $M$ is fixed, this being sufficient for our purposes.
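The definition above can be sketched numerically as follows. This is only an illustration under stated assumptions: the moment function `phi`, the Gaussian sampling model and the grid standing in for the compact set $\Theta = [0, 3]$ are all hypothetical choices, and the minimization is done by brute-force search rather than by any method used in the literature.

```python
import numpy as np

def phi(theta, x):
    """Hypothetical moment function Phi(theta, x); returns an (n, 2) array."""
    return np.column_stack([x - theta, x**2 - 2.0 * theta**2 - 1.0])

def gmm_estimate(x, M, grid):
    """Minimise ||F(theta, mu_n)||_M^2 over a finite grid standing in for Theta."""
    best_theta, best_value = None, np.inf
    for theta in grid:
        F = phi(theta, x).mean(axis=0)  # F(theta, mu_n), the empirical moment
        value = F @ M @ F               # squared M-norm of the empirical moment
        if value < best_value:
            best_theta, best_value = theta, value
    return best_theta

rng = np.random.default_rng(0)
theta0 = 1.0
# Assumed sampling model: X ~ N(theta0, theta0^2 + 1), so that E[Phi(theta0, X)] = 0.
x = rng.normal(loc=theta0, scale=np.sqrt(theta0**2 + 1.0), size=5000)

grid = np.linspace(0.0, 3.0, 3001)          # discretised compact parameter set
theta_hat = gmm_estimate(x, np.eye(2), grid)  # fixed scaling matrix M = identity
print("GMM estimate with M = I:", theta_hat)
```

A grid search is used only because it needs no tuning in one dimension; any optimizer over $\Theta$ could replace it.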

The generalized method of moments is a good illustration of the first procedure, as the GMM estimator $\hat\theta$ can be seen as the image of the empirical distribution $\mu_n$ by the function

$$\theta_M(\nu) = \arg\min_{\theta \in \Theta} \|F(\theta, \nu)\|_M, \quad \nu \in \mathcal{P},$$

where $\mathcal{P}$ is an extension of the original model $\mathcal{M}$, containing $\mu_n$. For the sake of generality, $\mathcal{P}$ is taken as the set of all probability measures $\nu$ for which $F(., \nu)$ takes finite values on $\Theta$. Because $\Theta$ is compact, $\mathcal{P}$ does not depend on the scaling matrix $M$.

This procedure may seem inefficient at first. Indeed, extending the parameter over a larger model $\mathcal{P}$ implicitly increases the size of the model, and thus decreases the information available. To provide an efficient estimation, the extension $\theta_M$ must be "smooth" enough so that differentiable submodels in $\mathcal{P}$ carry at least as much information as the original model. Basically, we want the efficiency bound $B_M$ for estimating $\theta_M$ over $\mathcal{P}$ not to be higher than the original bound $B$. Since it obviously cannot be lower, the objective is to find an efficient extension, for which $B_M = B$.

Theorem 3.1 Suppose that Assumptions 1 to 4 hold. The efficiency bound for estimating $\theta_M$ in $\mathcal{P}$ is

$$B_M = [D M D^t]^{-1} [D M V M D^t] [D M D^t]^{-1}.$$

This result was originally shown in Chamberlain (1987). We propose a different proof in the Appendix, based on modern tools from semiparametric efficiency theory.

As expected, the efficiency bound $B_M$ in the extended model $\mathcal{P}$ is at least as large as in the original model $\mathcal{M}$ (see Lemma 5.1 in the Appendix). The asymptotic variance of the GMM estimator is precisely the lower bound $B_M$, as shown in Hansen (1982), which proves the efficiency of the method. The theorem also covers the results of Hansen (1982) and Chamberlain (1987) on optimal GMM for $M = V^{-1}$, leading to an efficiency bound in the extended model that is equal to the original bound $B$ of Theorem 2.1.
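The comparison between $B_M$ and $B$ can be checked numerically. The sketch below uses illustrative values of $D$ and $V$ (they correspond to a hypothetical Gaussian toy model and are not taken from the paper) and evaluates the sandwich formula of Theorem 3.1 for $M = \mathrm{Id}$ and for $M = V^{-1}$.

```python
import numpy as np

def sandwich_bound(D, V, M):
    """B_M = [D M D^t]^{-1} [D M V M D^t] [D M D^t]^{-1}."""
    bread = np.linalg.inv(D @ M @ D.T)
    return bread @ (D @ M @ V @ M @ D.T) @ bread

# Illustrative values (assumed for this sketch): d = 1, k = 2.
D = np.array([[-1.0, -4.0]])
V = np.array([[2.0, 4.0],
              [4.0, 16.0]])

B = np.linalg.inv(D @ np.linalg.inv(V) @ D.T)       # bound of Theorem 2.1
B_identity = sandwich_bound(D, V, np.eye(2))        # GMM with M = Id
B_optimal = sandwich_bound(D, V, np.linalg.inv(V))  # GMM with M = V^{-1}

print("B          :", B[0, 0])
print("B_M, M = Id:", B_identity[0, 0])
print("B_M, M = V^{-1}:", B_optimal[0, 0])
```

With these values the identity weighting gives a bound slightly above $B$, while $M = V^{-1}$ recovers $B$ up to rounding.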

Note that the matrix $V$ is generally unknown, since it depends on both $\mu$ and $\theta_0$. In this case, it is replaced by a consistent estimate $\hat V$, leading to the same asymptotic properties under regularity conditions. Here again, several approaches are possible.

In the two-step GMM procedure, the estimate $\hat V$ is built using a preliminary estimator $\tilde\theta$ of $\theta_0$ obtained by a GMM procedure with known scaling matrix (in general, the identity matrix). As a result, $\tilde\theta$ is not in general asymptotically efficient; however, it is $\sqrt{n}$-consistent and makes it possible to construct a consistent estimate of $V$.
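A minimal sketch of the two-step procedure is given below. The moment function, the sampling model, the grid search over $\Theta = [0, 3]$ and the helper names are assumptions made for the illustration; the estimate $\hat V$ is taken to be the empirical second-moment matrix of $\Phi(\tilde\theta, X_i)$, which is one natural consistent choice among others.

```python
import numpy as np

def phi(theta, x):
    """Hypothetical moment function Phi(theta, x); returns an (n, 2) array."""
    return np.column_stack([x - theta, x**2 - 2.0 * theta**2 - 1.0])

def gmm_estimate(x, M, grid):
    """Grid minimiser of F(theta, mu_n)^t M F(theta, mu_n)."""
    values = []
    for theta in grid:
        F = phi(theta, x).mean(axis=0)
        values.append(F @ M @ F)
    return grid[int(np.argmin(values))]

def two_step_gmm(x, grid):
    # Step 1: preliminary estimator with the identity scaling matrix.
    theta_prelim = gmm_estimate(x, np.eye(2), grid)
    # Estimate V by the empirical second-moment matrix at the preliminary value.
    moments = phi(theta_prelim, x)
    V_hat = moments.T @ moments / len(x)
    # Step 2: GMM with scaling matrix V_hat^{-1}.
    return gmm_estimate(x, np.linalg.inv(V_hat), grid), theta_prelim

rng = np.random.default_rng(0)
theta0 = 1.0
x = rng.normal(loc=theta0, scale=np.sqrt(theta0**2 + 1.0), size=5000)
grid = np.linspace(0.0, 3.0, 3001)

theta_hat, theta_prelim = two_step_gmm(x, grid)
print("preliminary estimate:", theta_prelim)
print("two-step estimate   :", theta_hat)
```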

Another solution is to minimize over $\Theta$ the map

$$\theta \mapsto F(\theta, \mu_n)^t\, \hat V^{-1}(\theta)\, F(\theta, \mu_n), \qquad (3)$$

where $\hat V^{-1}(\theta)$ denotes an arbitrary consistent estimate of $V^{-1}(\theta)$, for all $\theta \in \Theta$. The latter approach was introduced in Hansen et al. (1996) as the continuous updating estimation (CUE).
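The criterion (3) can be sketched in the same toy setting. Everything specific here is an assumption of the illustration: the moment function, the Gaussian sample, the grid search, and the choice of the empirical second-moment matrix of $\Phi(\theta, X_i)$ as the estimate of $V(\theta)$.

```python
import numpy as np

def phi(theta, x):
    """Hypothetical moment function Phi(theta, x); returns an (n, 2) array."""
    return np.column_stack([x - theta, x**2 - 2.0 * theta**2 - 1.0])

def cue_objective(theta, x):
    """F(theta, mu_n)^t V_hat(theta)^{-1} F(theta, mu_n), V_hat re-estimated at theta."""
    moments = phi(theta, x)
    F = moments.mean(axis=0)
    V_theta = moments.T @ moments / len(x)  # assumed estimate of V(theta)
    return F @ np.linalg.solve(V_theta, F)

def cue_estimate(x, grid):
    """Grid minimiser of the continuous updating criterion."""
    values = [cue_objective(theta, x) for theta in grid]
    return grid[int(np.argmin(values))]

rng = np.random.default_rng(0)
theta0 = 1.0
x = rng.normal(loc=theta0, scale=np.sqrt(theta0**2 + 1.0), size=5000)
grid = np.linspace(0.0, 3.0, 3001)

theta_cue = cue_estimate(x, grid)
print("CUE estimate:", theta_cue)
```

Unlike the two-step procedure, the scaling matrix is recomputed at every candidate value of $\theta$, so no preliminary estimator is needed.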



3.2 Generalized empirical likelihood 

Generalized empirical likelihood (GEL) was first applied to this problem in Qin and Lawless (1994), generalizing an idea of Owen (1991). This method is an application of the second procedure. An estimate $\hat\mu$ of $\mu$ is obtained as an entropic projection (in a general sense defined below) of the empirical measure $\mu_n$ onto the model $\mathcal{M}$. Hence, the measure $\hat\mu$ is the element of the model that minimizes a given $f$-divergence $\mathcal{D}_f(\mu_n, .)$ with respect to the empirical distribution. Let us recall some definitions.

Definition Let $f$ be a strictly convex function with $f(1) = f'(1) = 0$, and let $P$, $Q$ be two probability measures on $\mathcal{X}$. The $f$-divergence of $Q$ with respect to $P$ is defined as

$$\mathcal{D}_f(P, Q) = \int f\!\left(\frac{dQ}{dP}\right) dP \ \text{ if } Q \ll P, \qquad \mathcal{D}_f(P, Q) = \infty \ \text{ otherwise}.$$

An $f$-divergence measures the "closeness" between two probability measures. It is non-negative and is null only if $P = Q$. This definition can be extended to sets of measures by writing, for $S$ a subset of $\mathcal{P}(\mathcal{X})$,

$$\mathcal{D}_f(P, S) = \inf_{Q \in S} \mathcal{D}_f(P, Q).$$

Definition We call entropic projection of $\nu$ on $S$ associated to $f$ an element $\nu^* \in S$ such that $\mathcal{D}_f(\nu, S) = \mathcal{D}_f(\nu, \nu^*) < \infty$.

An entropic projection always exists as soon as $S$ is closed for the total variation topology and $\mathcal{D}_f(\nu, S)$ is finite. Furthermore, it is unique if $S$ is also convex (see Csiszar (1967)).

Setting for a fixed $\theta \in \Theta$, $\mathcal{M}_\theta := \{\nu \in \mathcal{P}(\mathcal{X}) : F(\theta, \nu) = 0\}$, the model can be written as $\mathcal{M} = \cup_{\theta \in \Theta} \mathcal{M}_\theta$. Thus, the GEL estimator $\hat\theta = \theta(\hat\mu)$ follows by

$$\hat\theta = \arg\min_{\theta \in \Theta} \mathcal{D}_f(\mu_n, \mathcal{M}_\theta).$$

Since $\mathcal{M}_\theta$ is closed and convex, the entropy $\mathcal{D}_f(\mu_n, \mathcal{M}_\theta)$ is reached for a unique measure $\hat\mu(\theta)$ in $\mathcal{M}_\theta$, provided that $\mathcal{D}_f(\mu_n, \mathcal{M}_\theta)$ is finite. Computing the GEL estimator thus involves a two-step procedure. First, build for each $\theta \in \Theta$ the entropic projection $\hat\mu(\theta)$ of $\mu_n$ onto $\mathcal{M}_\theta$. Then, minimize $\mathcal{D}_f(\mu_n, \hat\mu(\theta))$ with respect to $\theta$. Since $\hat\mu(\theta)$ is absolutely continuous w.r.t. $\mu_n$ by construction, minimizing $\mathcal{D}_f(\mu_n, .)$ reduces to finding the proper weights $p_1, \dots, p_n$ to allocate to the observations $X_1, \dots, X_n$. This turns into a finite dimensional problem, which can be solved by classical convex optimization tools (see for instance Kitamura (2006)). Finally, the GEL estimator $\hat\theta$ can be expressed as the solution to the saddle point problem

$$\hat\theta = \arg\min_{\theta \in \Theta} \sup_{(\lambda_1, \lambda_2) \in \mathbb{R}^{k+1}} \left\{ \lambda_1 - \frac{1}{n} \sum_{i=1}^n f^*\big(\lambda_1 + \lambda_2^t \Phi(\theta, X_i)\big) \right\},$$

where $f^*(x) = \sup_y \{xy - f(y)\}$ denotes the convex conjugate of $f$.

Note that although the choice of the $f$-divergence plays a key role in the construction of the estimator, it has no influence on its asymptotic efficiency. Indeed, Qin and Lawless (1994) show that all GEL estimators are asymptotically efficient, regardless of the $f$-divergence used for their computation. Nevertheless, many situations justify the use of specific $f$-divergences. In its original form, the empirical likelihood (EL) estimator in Owen (1991) uses the Kullback entropy $K(., .)$ as $f$-divergence, pointing out that minimizing $K(\mu_n, .)$ reduces to maximizing the likelihood among multinomial distributions. Newey and Smith (2004) remark that a quadratic $f$-divergence leads to the CUE estimator of Hansen et al. (1996). Many choices of $f$-divergence can also be given a Bayesian interpretation, using the maximum entropy on the mean (MEM) approach, as shown in Gamboa and Gassiat (1997).
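For the quadratic divergence $f(y) = (y - 1)^2/2$, the inner projection step has a closed form, which the following sketch illustrates for a fixed $\theta$. This is a simplified illustration under assumptions: the moment function and the sample are hypothetical, and the positivity constraint on the weights is ignored, so the solution may contain slightly negative weights. Writing $\bar\Phi$ for the empirical mean of $\Phi(\theta, X_i)$ and $S$ for its centred empirical covariance, the first order conditions give $p_i = \frac{1}{n}\big(1 - (\Phi(\theta, X_i) - \bar\Phi)^t S^{-1} \bar\Phi\big)$ and a minimal divergence equal to $\frac{1}{2} \bar\Phi^t S^{-1} \bar\Phi$, a quadratic form of the same type as the CUE criterion.

```python
import numpy as np

def phi(theta, x):
    """Hypothetical moment function Phi(theta, x); returns an (n, 2) array."""
    return np.column_stack([x - theta, x**2 - 2.0 * theta**2 - 1.0])

def quadratic_projection(moments):
    """Weights p_i minimising (1/2n) * sum (n p_i - 1)^2 subject to
    sum p_i = 1 and sum p_i Phi_i = 0 (sign constraints are not enforced)."""
    n = moments.shape[0]
    mean = moments.mean(axis=0)
    centred = moments - mean
    S = centred.T @ centred / n        # centred empirical covariance
    lam = np.linalg.solve(S, mean)     # S^{-1} * empirical mean
    weights = (1.0 - centred @ lam) / n
    divergence = 0.5 * mean @ lam      # (1/2) mean^t S^{-1} mean
    return weights, divergence

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=np.sqrt(2.0), size=2000)
moments = phi(1.1, x)                  # projection computed at a fixed theta = 1.1

weights, divergence = quadratic_projection(moments)
direct = 0.5 * np.mean((len(x) * weights - 1.0) ** 2)  # divergence from its definition
print("sum of weights:", weights.sum())
print("reweighted moment:", weights @ moments)
print("divergence:", divergence, "direct evaluation:", direct)
```

Minimizing the returned divergence over $\theta$ would complete the two-step construction described above for this particular choice of $f$.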

4 Dealing with an approximate constraint 

In many practical applications, only an approximation of the constraint function is available. This may occur if the moment conditions take complicated forms that can only be evaluated numerically or by simulations. McFadden (1989) suggested a method dealing with approximate constraints in a similar situation, introducing the method of simulated moments (see also Carrasco and Florens (2000)). In Loubes and Pelletier (2008) and Loubes and Rochet (2009), the authors study a MEM procedure for linear inverse problems with approximate constraints. Here, we propose to extend the GMM framework to a situation with approximate moment conditions. We assume that we observe a sequence $(\Phi_m(\theta, .))_{m \in \mathbb{N}}$ of approximate constraints, independent of the original sample $X_1, \dots, X_n$. We are interested in exhibiting sufficient conditions on the sequence $(\Phi_m(\theta, .))_m$ under which estimating $\theta_0$ with GMM procedures remains efficient when the constraint is replaced by its approximation. We discuss the asymptotic properties of the resulting estimates in a framework where both indices $n$ and $m$ simultaneously grow to infinity.

In the sequel, we denote by $W(\theta)$ the inverse of the covariance matrix of $\Phi(\theta, X)$,

$$W(\theta) = \left[ \int \Phi(\theta, .) \Phi^t(\theta, .)\, d\mu - \int \Phi(\theta, .)\, d\mu \int \Phi^t(\theta, .)\, d\mu \right]^{-1}, \quad \theta \in \Theta,$$

while $\hat W(\theta)$ denotes an arbitrary consistent estimator of $W(\theta)$, built from the observations and the constraint function $\Phi(\theta, .)$. In the same way, $W_m(\theta)$ and $\hat W_m(\theta)$ are defined by replacing $\Phi$ by its approximation $\Phi_m$ in the expressions of $W(\theta)$ and $\hat W(\theta)$ respectively.

For $E$ a Euclidean space endowed with a norm $\|.\|$, a function $f : \Theta \to E$ and a set $S \subset \Theta$, we write

$$\|f\|_S = \sup_{\theta \in S} \|f(\theta)\|.$$

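We make the following assumptions, where we recall that $\mathcal{N}$ is a neighborhood of $\theta_0$ defined in Assumption 4.

• Assumption 6: $\|\Phi(., x)\|_\Theta$, $\|\nabla \Phi(., x)\|_{\mathcal{N}}$ and $\|\partial^2 \Phi(., x)/\partial \theta \partial \theta^t\|_{\mathcal{N}}$ are dominated by a function $\kappa$ such that $\int \kappa^{12}(x)\, d\mu(x) < \infty$.

• Assumption 7: For $(\varphi_m)_{m \in \mathbb{N}}$ a given sequence tending to infinity, the functions $\varphi_m \|\Phi_m(., x) - \Phi(., x)\|_\Theta$ and $\varphi_m \|\nabla \Phi_m(., x) - \nabla \Phi(., x)\|_{\mathcal{N}}$ are dominated by a function $\kappa_m(x)$ such that $\sup_m \int \kappa_m^{12}(x)\, d\mu(x) < \infty$.

• Assumption 8: The random map $\theta \mapsto \hat W(\theta)$ is differentiable on $\mathcal{N}$, and $\mathbb{E}(\sqrt{n}\, \|\hat W - W\|_\Theta)^6$, $\mathbb{E}(\sqrt{n}\, \|\nabla \hat W - \nabla W\|_{\mathcal{N}})^6$ and $\mathbb{E}(\varphi_m \|\hat W_m - \hat W\|_\Theta)^3$ are bounded as $m, n$ range over $\mathbb{N}$.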

Approximate GMM estimation consists in minimizing over $\Theta$ the map

$$\theta \mapsto \hat\ell_m(\theta) = \left[ \int \Phi_m(\theta, .)\, d\mu_n \right]^t \hat W_m \left[ \int \Phi_m(\theta, .)\, d\mu_n \right],$$

where $\hat W_m$ is a random matrix with properties to be specified below. The accuracy of approximate GMM relies on how close the approximate contrast function $\hat\ell_m$ is to its true value (i.e. when the constraint function is known). To this end, the scaling matrix $\hat W_m$ should be chosen as close as possible to the optimal choice $W_0 = W(\theta_0)$.

As in the situation where the constraint function is known, the two-step GMM procedure provides a natural way to compute the scaling matrix $\hat W_m$. First build a preliminary estimator $\tilde\theta_m$, minimizing over $\Theta$ the map

$$\theta \mapsto \tilde\ell_m(\theta) = \left[ \int \Phi_m(\theta, .)\, d\mu_n \right]^t \left[ \int \Phi_m(\theta, .)\, d\mu_n \right],$$

which corresponds to a GMM procedure with identity scaling matrix. Then, define $\hat W_m = \hat W_m(\tilde\theta_m)$, which is used as scaling matrix in the contrast function $\hat\ell_m$. The resulting approximate two-step GMM estimator has good asymptotic properties as soon as the approximate function $\Phi_m$ converges fast enough towards $\Phi$, as proved in the following theorem.

Theorem 4.1 (Robustness of two-step GMM) Denote by $\hat\theta_m$ and $\hat\theta$ the two-step GMM estimators obtained respectively with the constraint functions $\Phi_m$ and $\Phi$. If Assumptions 1 to 8 hold,

$$n\, \mathbb{E}(\|\hat\theta_m - \hat\theta\|)^2 = O(n \varphi_m^{-2}) + o(1).$$

In particular, $\hat\theta_m$ is $\sqrt{n}$-consistent and asymptotically efficient if $n/\varphi_m^2$ tends to zero.
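The effect described by the theorem can be observed in a small experiment. The sketch below is purely illustrative and relies on several assumptions: a hypothetical moment function `phi`, a hypothetical approximation `phi_m` whose uniform error is exactly $0.5/\varphi_m$, a grid search over $\Theta = [0, 3]$, and the centred empirical covariance as estimate of $W^{-1}$. It compares the two-step estimators built from the exact and from the approximate constraint on the same sample.

```python
import numpy as np

def phi(theta, x):
    """Hypothetical exact moment function; returns an (n, 2) array."""
    return np.column_stack([x - theta, x**2 - 2.0 * theta**2 - 1.0])

def make_phi_m(rate):
    """Hypothetical approximation Phi_m with uniform error 0.5 / rate."""
    def phi_m(theta, x):
        return phi(theta, x) + 0.5 / rate
    return phi_m

def gmm_estimate(moment_fn, x, M, grid):
    """Grid minimiser of F^t M F, with F the empirical mean of moment_fn."""
    values = []
    for theta in grid:
        F = moment_fn(theta, x).mean(axis=0)
        values.append(F @ M @ F)
    return grid[int(np.argmin(values))]

def two_step_gmm(moment_fn, x, grid):
    """Identity-weighted first step, then inverse-covariance weighting."""
    theta_prelim = gmm_estimate(moment_fn, x, np.eye(2), grid)
    moments = moment_fn(theta_prelim, x)
    centred = moments - moments.mean(axis=0)
    W_hat = np.linalg.inv(centred.T @ centred / len(x))
    return gmm_estimate(moment_fn, x, W_hat, grid)

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=np.sqrt(2.0), size=5000)
grid = np.linspace(0.0, 3.0, 3001)

theta_exact = two_step_gmm(phi, x, grid)
# Gap between approximate and exact estimators for increasing accuracy rates.
gaps = {rate: abs(two_step_gmm(make_phi_m(rate), x, grid) - theta_exact)
        for rate in (1.0, 10.0, 1000.0)}
for rate, gap in gaps.items():
    print("rate", rate, "gap", gap)
```

In this toy setting the gap between the two estimators shrinks as the accuracy rate grows, in line with the role played by $\varphi_m$ in the theorem; the grid resolution limits how small a gap can be resolved.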

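In the same way, the CUE procedure can be adapted to the case of an approximate constraint. However, the robustness of CUE with an approximate constraint requires slightly stronger assumptions.

• Assumption 9: $W(.)$ and $\hat W(.)$ are twice continuously differentiable on $\mathcal{N}$ and, for all $\eta > 0$, $\mathbb{P}(\|\partial^2 \hat W/\partial \theta \partial \theta^t - \partial^2 W/\partial \theta \partial \theta^t\|_{\mathcal{N}} > \eta) = o(n^{-1})$. Besides, $\hat W_m(.)$ is differentiable on $\mathcal{N}$ and $\mathbb{E}(\varphi_m \|\nabla \hat W_m - \nabla \hat W\|_{\mathcal{N}})^3$ is bounded as $m, n$ range over $\mathbb{N}$.

Applying the procedure to the approximate constraint, the approximate CUE estimator follows by minimizing over $\Theta$ the map

$$\theta \mapsto \hat\zeta_m(\theta) = \left[ \int \Phi_m(\theta, .)\, d\mu_n \right]^t \hat W_m(\theta) \left[ \int \Phi_m(\theta, .)\, d\mu_n \right].$$

Corollary 4.2 (Robustness of CUE) Denote by $\hat\theta_m$ and $\hat\theta$ the CUE estimators obtained respectively with the constraint functions $\Phi_m$ and $\Phi$. If Assumptions 1 to 9 hold,

$$n\, \mathbb{E}(\|\hat\theta_m - \hat\theta\|)^2 = O(n \varphi_m^{-2}) + o(1).$$

In particular, $\hat\theta_m$ is $\sqrt{n}$-consistent and asymptotically efficient if $n/\varphi_m^2$ tends to zero.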

5 Appendix 

5.1 Technical lemmas 

Lemma 5.1 For every symmetric positive-definite matrix $M$,

$$D M D^t [D M V M D^t]^{-1} D M D^t \leq D V^{-1} D^t,$$

with equality for $M = V^{-1}$.

Proof. Set $A = V^{1/2} M D^t$. The matrix $A [A^t A]^{-1} A^t$ is an orthogonal projection matrix, so that in particular $A [A^t A]^{-1} A^t \leq \mathrm{Id}$. The inequality follows after multiplying each term by $D V^{-1/2}$ on the left and $V^{-1/2} D^t$ on the right, proving the result.

Lemma 5.2 Let $f : \Theta \to \mathbb{R}$ be a continuous nonnegative function with a unique zero $\theta_0$ lying in the interior of the compact set $\Theta$ and with positive definite Hessian matrix at $\theta_0$. Assume that $f$ is twice continuously differentiable on a neighborhood $\mathcal{N}$ of $\theta_0$. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of nonnegative random functions, twice continuously differentiable on $\mathcal{N}$, converging in probability towards $f$. Write $\mathcal{H} = \partial^2 f/\partial \theta \partial \theta^t$ and $\mathcal{H}_n = \partial^2 f_n/\partial \theta \partial \theta^t$. Moreover, for all $n \in \mathbb{N}$, let $(f_{m,n})_{m \in \mathbb{N}}$ be a sequence of nonnegative random functions converging towards $f_n$ as $m \to \infty$. Denote by $\theta_{m,n}$ and $\theta_n$ a minimizer of $f_{m,n}$ and $f_n$ respectively. If the following conditions are met

i) for all $\eta > 0$, $\mathbb{P}(\|f_n - f\|_\Theta > \eta) = o(n^{-1})$ and $\mathbb{P}(\|\mathcal{H}_n - \mathcal{H}\|_{\mathcal{N}} > \eta) = o(n^{-1})$,

ii) the $f_{m,n}$ are differentiable on $\mathcal{N}$ and $C_1 = \sup_{m,n} \mathbb{E}(\varphi_m \|f_{m,n} - f_n\|_\Theta)^p$ and $C_2 = \sup_{m,n} \mathbb{E}(\varphi_m \|\nabla f_{m,n} - \nabla f_n\|_{\mathcal{N}})^p$ are finite for some $p > 0$ and a sequence $(\varphi_m)_{m \in \mathbb{N}}$ tending to infinity,

then there is a constant $K > 0$ such that

$$\mathbb{E}\|\theta_{m,n} - \theta_n\|^p \leq K \varphi_m^{-p} + o(n^{-1}).$$

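Proof. By continuity of $\mathcal{H}$ around $\theta_0$, we may assume without loss of generality that $\mathcal{N}$ is such that $\mathcal{H}(\theta)$ has all its eigenvalues larger than some constant $2c > 0$ for all $\theta \in \mathcal{N}$. Denote by $\rho_n$ the smallest eigenvalue of $\mathcal{H}_n(\theta)$ as $\theta$ ranges over $\mathcal{N}$. The uniform convergence of $\mathcal{H}_n$ on $\mathcal{N}$ in condition i) ensures that $\mathbb{P}(\rho_n < c) = o(n^{-1})$. Besides, since $\theta_0$ is the unique zero of $f$ on the compact set $\Theta$, we can find a constant $\eta_1 > 0$ such that $\theta_n$ lies in $\mathcal{N}$ as soon as $\|f_n - f\|_\Theta < \eta_1$. Hence, still by condition i), $\mathbb{P}(\theta_n \notin \mathcal{N}) = o(n^{-1})$. In the same way, there is a constant $\eta_2 > 0$ such that $\mathbb{P}(\theta_{m,n} \notin \mathcal{N}) \leq \mathbb{P}(\|f_{m,n} - f\|_\Theta > 2\eta_2)$, with

$$\begin{aligned}
\mathbb{P}(\|f_{m,n} - f\|_\Theta > 2\eta_2) &\leq \mathbb{P}(\|f_{m,n} - f_n\|_\Theta + \|f_n - f\|_\Theta > 2\eta_2) \\
&\leq \mathbb{P}(\|f_{m,n} - f_n\|_\Theta > \eta_2) + \mathbb{P}(\|f_n - f\|_\Theta > \eta_2) \\
&\leq C_1 (\varphi_m \eta_2)^{-p} + o(n^{-1}),
\end{aligned}$$

by Chebyshev's inequality. Call $\Omega$ the intersection of the three events $\{\theta_n \in \mathcal{N}\}$, $\{\theta_{m,n} \in \mathcal{N}\}$ and $\{\rho_n > c\}$. We get $\mathbb{P}(\Omega^c) \leq C_1 (\varphi_m \eta_2)^{-p} + o(n^{-1})$, where $\Omega^c$ denotes the complement of $\Omega$. Moreover, on $\Omega$, we have

$$\|\nabla f_{m,n} - \nabla f_n\|_{\mathcal{N}} \geq \|\nabla f_n(\theta_{m,n})\| \geq c\, \|\theta_{m,n} - \theta_n\|.$$

Let $\delta$ be the diameter of $\Theta$. It follows that

$$\begin{aligned}
\mathbb{E}\|\theta_{m,n} - \theta_n\|^p &\leq c^{-p}\, \mathbb{E}\|\nabla f_{m,n} - \nabla f_n\|_{\mathcal{N}}^p + \delta^p\, \mathbb{P}(\Omega^c) \\
&\leq K \varphi_m^{-p} + o(n^{-1}),
\end{aligned}$$

for $K = C_1 \delta^p/\eta_2^p + C_2/c^p$.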

5.2 Proofs

Proof of Theorem 3.1. Denote by $T$ the set of bounded functions with zero mean under $\mu$. For any $g \in T$ and $t \geq 0$, the measure $\mu_t := (1 + tg)\mu$ lies in $\mathcal{P}$ provided that $t$ is small enough. The path $\{\mu_t, t \geq 0\}$ is thus differentiable with score $g$.

The uniform convergence of $F(., \mu_t)$ towards $F(., \mu)$ (which follows from Assumptions 1 and 2) ensures the existence of a minimizer $\theta(t)$ of $\|F(., \mu_t)\|_M$ that tends to $\theta_0$ as $t \to 0$ and satisfies the first order condition $\gamma_M(\theta(t), \mu_t) = 0$, where

$$\gamma_M(\theta, \nu) = \left[ \int \nabla \Phi(\theta, .)\, d\nu \right] M \left[ \int \Phi(\theta, .)\, d\nu \right], \quad (\theta, \nu) \in \Theta \times \mathcal{P}.$$

Under Assumptions 2 to 4, the implicit function theorem applied to the map $(\theta, t) \mapsto \gamma_M(\theta, \mu_t)$ in a neighborhood of $(\theta_0, 0)$ warrants the uniqueness of the minimum $\theta(t) = \theta_M(\mu_t)$.

Denote by $\tilde l = (\tilde l_1, \dots, \tilde l_d)^t$ the efficient influence function of $\theta_M$. By a Taylor expansion of $\Phi_\theta := \Phi(\theta, .)$ at $\theta_0$ and using that $\gamma_M(\theta_M(\mu_t), \mu_t) = 0$, we get

$$\left[ \int \nabla \Phi_{\theta_0}\, d\mu_t \right] M \left[ \int \Phi_{\theta_0} (1 + tg)\, d\mu + \left[ \int \nabla \Phi_{\theta_0}^t\, d\mu_t \right] (\theta_M(\mu_t) - \theta_0) \right] = o(t).$$

Since $\theta_M(\mu_t) - \theta_0 = t \int \tilde l\, g\, d\mu + o(t)$ by definition of $\tilde l$, we obtain after dividing each term by $t$ and letting $t$ tend to zero

$$D M \left[ \int \Phi_{\theta_0}\, g\, d\mu \right] = - D M D^t \int \tilde l\, g\, d\mu.$$

Since this holds for all $g \in T$, we conclude that

$$\tilde l(.) = -[D M D^t]^{-1} D M \Phi(\theta_0, .),$$

checking beforehand that $\tilde l$ lies in the closure of $T$. The efficiency bound is the variance of $\tilde l(X)$, which proves the result.
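Proof of Theorem 4.1. For all $\theta \in \Theta$, let

$$\alpha(\theta) = \int \Phi(\theta, .)\, d\mu, \quad \beta(\theta) = \int \nabla \Phi(\theta, .)\, d\mu, \quad \gamma(\theta) = \int \frac{\partial^2 \Phi(\theta, .)}{\partial \theta \partial \theta^t}\, d\mu.$$

Besides, denote by $\hat\alpha(\theta)$ the empirical estimate of $\alpha(\theta)$ and by $\hat\alpha_m(\theta)$ the estimate built with $\Phi_m$, and define $\hat\beta(\theta)$, $\hat\gamma(\theta)$, $\hat\beta_m(\theta)$ and $\hat\gamma_m(\theta)$ analogously.

First, we prove that $\mathbb{E}(\|\tilde\theta_m - \tilde\theta\|^6) = O(\varphi_m^{-6}) + o(n^{-1})$. It suffices to verify the conditions of Lemma 5.2 for $p = 6$, taking $f_n = \tilde\ell = \hat\alpha^t \hat\alpha$, $f_{m,n} = \tilde\ell_m = \hat\alpha_m^t \hat\alpha_m$, and $f = \alpha^t \alpha$. In this particular case, we have $\mathcal{H}_n = \partial^2 \tilde\ell/\partial \theta \partial \theta^t = 2 \hat\beta \hat\beta^t + 2 \hat\gamma \hat\alpha$ and $\mathcal{H} = 2 \beta \beta^t + 2 \gamma \alpha$.

First note that $\mathcal{H}(\theta_0) = 2 \beta(\theta_0) \beta^t(\theta_0)$ is positive definite by Assumption 5. Furthermore, $\hat\alpha(\theta)$ is asymptotically normal and since $\|\Phi(\theta, .)\|$ is dominated by a square integrable function $\kappa$ on $\Theta$, we have, for all $\eta > 0$,

$$\mathbb{P}(\|\hat\alpha - \alpha\|_\Theta > \eta) = o(n^{-1}).$$

By assumption, the same argument holds for $\|\hat\beta - \beta\|_{\mathcal{N}}$ and $\|\hat\gamma - \gamma\|_{\mathcal{N}}$. Condition i) in Lemma 5.2 follows directly, noticing that

$$\mathcal{H}_n - \mathcal{H} = 2 (\hat\beta - \beta)(\hat\beta + \beta)^t + 2 (\hat\gamma - \gamma) \hat\alpha + 2 \gamma (\hat\alpha - \alpha).$$

Moreover, $\|\tilde\ell_m - \tilde\ell\|_\Theta = \|\hat\alpha_m^t \hat\alpha_m - \hat\alpha^t \hat\alpha\|_\Theta \leq \|\hat\alpha_m + \hat\alpha\|_\Theta \|\hat\alpha_m - \hat\alpha\|_\Theta$, yielding

$$\mathbb{E}(\varphi_m \|\tilde\ell_m - \tilde\ell\|_\Theta)^6 \leq \left[ \mathbb{E}(\varphi_m \|\hat\alpha_m - \hat\alpha\|_\Theta)^{12} \right]^{1/2} \left[ \mathbb{E}(\|\hat\alpha_m + \hat\alpha\|_\Theta)^{12} \right]^{1/2}$$

by the Cauchy-Schwarz inequality. Thus, $\mathbb{E}(\varphi_m \|\tilde\ell_m - \tilde\ell\|_\Theta)^6$ is finite by Assumptions 6 and 7. Since $\nabla \tilde\ell_m = 2 \hat\beta_m \hat\alpha_m$ and $\nabla \tilde\ell = 2 \hat\beta \hat\alpha$, the assumptions also warrant that $\mathbb{E}(\varphi_m \|\nabla \tilde\ell_m - \nabla \tilde\ell\|_{\mathcal{N}})^6 < \infty$. Lemma 5.2 then gives

$$\mathbb{E}\|\tilde\theta_m - \tilde\theta\|^6 = O(\varphi_m^{-6}) + o(n^{-1}).$$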

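To show the result, we shall now verify that the conditions of Lemma 5.2 hold for $p = 2$ with the functions $f_{m,n} = \hat\ell_m$, $f_n = \hat\ell$, $f = \ell$. We now consider $\mathcal{H}_n = 2 \hat\beta \hat W \hat\beta^t + 2 \hat\gamma \hat W \hat\alpha$ and $\mathcal{H} = 2 \beta W_0 \beta^t + 2 \gamma W_0 \alpha$, where $\hat W = \hat W(\tilde\theta)$ and $W_0 = W(\theta_0)$.

The Hessian matrix $\mathcal{H}(\theta_0) = 2 \beta(\theta_0) W_0 \beta^t(\theta_0)$ is positive definite by Assumption 5. For condition i) of Lemma 5.2 to be satisfied, we need that for all $\eta > 0$, $\mathbb{P}(\|\hat W - W_0\| > \eta) = o(n^{-1})$. Since $\mathbb{P}(\tilde\theta \notin \mathcal{N}) = o(n^{-1})$, we shall only consider the case where $\tilde\theta \in \mathcal{N}$. By the triangle inequality, we get $\|\hat W(\tilde\theta) - W_0\| \leq \|\hat W(\tilde\theta) - \hat W(\theta_0)\| + \|\hat W(\theta_0) - W_0\|$ and we use that

$$\mathbb{P}(\|\hat W - W_0\| > \eta) \leq \mathbb{P}\left(\|\hat W(\tilde\theta) - \hat W(\theta_0)\| > \tfrac{\eta}{2}\right) + \mathbb{P}\left(\|\hat W(\theta_0) - W_0\| > \tfrac{\eta}{2}\right).$$

Assumption 8 gives $\mathbb{P}(\|\hat W(\theta_0) - W_0\| > \eta/2) = o(n^{-1})$, using Chebyshev's inequality. Furthermore, $\|\hat W(\tilde\theta) - \hat W(\theta_0)\| \leq \|\nabla \hat W\|_{\mathcal{N}} \|\tilde\theta - \theta_0\|$ for a suitable norm in $\mathbb{R}^{d \times k \times k}$ and, for $K > \mathbb{E}\|\nabla \hat W\|_{\mathcal{N}}$,

$$\mathbb{P}\left(\|\nabla \hat W\|_{\mathcal{N}} \|\tilde\theta - \theta_0\| > \tfrac{\eta}{2}\right) \leq \mathbb{P}\left(\|\tilde\theta - \theta_0\| > \tfrac{\eta}{2K}\right) + \mathbb{P}(\|\nabla \hat W\|_{\mathcal{N}} > K) = o(n^{-1}),$$

which ensures condition i) of Lemma 5.2. Write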

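$$\hat\ell_m - \hat\ell = (\hat\alpha_m - \hat\alpha)^t \hat W_m \hat\alpha_m + \hat\alpha_m^t (\hat W_m - \hat W) \hat\alpha + (\hat\alpha_m - \hat\alpha)^t \hat W \hat\alpha,$$

where each term can be controlled using Hölder's inequality, as we have for the middle term

$$\begin{aligned}
\mathbb{E}(\varphi_m \|\hat\alpha_m^t (\hat W_m - \hat W) \hat\alpha\|_\Theta)^2 &\leq \mathbb{E}(\|\hat\alpha_m\|_\Theta\, \varphi_m \|\hat W_m - \hat W\|\, \|\hat\alpha\|_\Theta)^2 \\
&\leq \left[ \mathbb{E}(\|\hat\alpha_m\|_\Theta \|\hat\alpha\|_\Theta)^6 \right]^{1/3} \left[ \mathbb{E}(\varphi_m \|\hat W_m - \hat W\|)^3 \right]^{2/3}
\end{aligned}$$

for an appropriate norm in $\mathbb{R}^{k \times k}$ for the matrix $\hat W_m - \hat W$. Apply the same procedure for the two other terms, with for instance

$$\begin{aligned}
\mathbb{E}(\varphi_m \|(\hat\alpha_m - \hat\alpha)^t \hat W_m \hat\alpha_m\|_\Theta)^2 &\leq \mathbb{E}(\varphi_m \|\hat\alpha_m - \hat\alpha\|_\Theta \|\hat W_m\| \|\hat\alpha_m\|_\Theta)^2 \\
&\leq \left[ \mathbb{E}(\varphi_m \|\hat\alpha_m - \hat\alpha\|_\Theta \|\hat\alpha_m\|_\Theta)^6 \right]^{1/3} \left[ \mathbb{E}\|\hat W_m\|^3 \right]^{2/3}.
\end{aligned}$$

To have $\sup_{m,n} \mathbb{E}(\varphi_m \|\hat\ell_m - \hat\ell\|_\Theta)^2 < \infty$, it suffices to show that $\mathbb{E}(\varphi_m \|\hat W_m - \hat W\|)^3$ is bounded as $n$ and $m$ range over $\mathbb{N}$, since the rest follows from the first part of the proof. This is true as soon as $\tilde\theta_m$ and $\tilde\theta$ both lie in $\mathcal{N}$, as we have on the event $\Omega = \{\tilde\theta, \tilde\theta_m \in \mathcal{N}\}$,

$$\begin{aligned}
\|\hat W_m - \hat W\| &\leq \|\hat W_m(\tilde\theta_m) - \hat W(\tilde\theta_m)\| + \|\hat W(\tilde\theta_m) - \hat W(\tilde\theta)\| \\
&\leq \|\hat W_m - \hat W\|_\Theta + \|\nabla \hat W\|_{\mathcal{N}} \|\tilde\theta_m - \tilde\theta\|
\end{aligned}$$

and the result follows from Assumption 8 and the Cauchy-Schwarz inequality, since both $\varphi_m \|\tilde\theta_m - \tilde\theta\|$ and $\|\nabla \hat W\|_{\mathcal{N}}$ have finite moments of order 6. Hence,

$$\sup_{n,m \in \mathbb{N}} \mathbb{E}(\varphi_m \|\hat\ell_m - \hat\ell\|_\Theta \mathbb{1}_\Omega)^2 < \infty.$$

The same reasoning leads to the same conclusion for $\nabla \hat\ell_m$ on $\mathcal{N}$, namely

$$\sup_{n,m \in \mathbb{N}} \mathbb{E}(\varphi_m \|\nabla \hat\ell_m - \nabla \hat\ell\|_{\mathcal{N}} \mathbb{1}_\Omega)^2 < \infty.$$

Following the proof of Lemma 5.2, we show that the complement of $\Omega$ occurs with negligible probability, as $\mathbb{P}(\Omega^c) = O(\varphi_m^{-6}) + o(n^{-1})$. Since $\|\hat\theta_m - \hat\theta\|$ remains bounded on $\Omega^c$, we conclude that $\mathbb{E}(\|\hat\theta_m - \hat\theta\| \mathbb{1}_{\Omega^c})^2 = o(\varphi_m^{-2}) + o(n^{-1})$, yielding

$$\mathbb{E}\|\hat\theta_m - \hat\theta\|^2 = O(\varphi_m^{-2}) + o(n^{-1}).$$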

Proof of Corollary 4.2. The proof is the same as for Theorem 4.1: we show that the conditions of Lemma 5.2 are satisfied for $f_{m,n} = \hat\zeta_m = \hat\alpha_m^t \hat W_m \hat\alpha_m$, $f_n = \hat\zeta = \hat\alpha^t \hat W \hat\alpha$ and $f = \alpha^t W \alpha$. Condition i) follows from Assumptions 6 and 9, and $\mathbb{E}(\varphi_m \|\nabla \hat\zeta_m - \nabla \hat\zeta\|_{\mathcal{N}})^2$ can be bounded as in the proof of the theorem, using the additional condition that $\mathbb{E}(\varphi_m \|\nabla \hat W_m - \nabla \hat W\|_{\mathcal{N}})^3$ is bounded.